Benchmarking Next Generation Hardware Platforms: An Experimental Approach
Abstract
Heterogeneous multi-cores, platforms composed of both general purpose and accelerator cores, are becoming increasingly common. Further, with processor designs in which there are many cores on a chip, a recent trend is to include functional and performance asymmetries to balance power usage against performance requirements. Coupled with this trend in CPUs is the development of high end interconnects providing low latency and high throughput communication. Understanding the utility of such next generation platforms for future datacenter workloads requires investigations that evaluate the combined effects on workloads of (1) processing units, (2) interconnect, and (3) usage models. For benchmarks, then, this requires functionality that makes it possible to easily yet separately vary the benchmark attributes that affect the performance observed for application-relevant metrics like throughput and end-to-end latency, and the effects on both of other concurrently running applications. To obtain these properties, benchmarks must be designed to test different and varying, rather than fixed, combinations of factors pertaining to their processing and communication behavior and their respective usage patterns (e.g., degree of burstiness). The ‘Nectere’ benchmarking framework is intended for understanding and evaluating next generation multicore platforms under varying workload conditions. This paper demonstrates two specific benchmarks constructed with Nectere: (1) a financial benchmark posing low-latency challenges, and (2) an image processing benchmark with high throughput expectations. Benchmark characteristics can be varied along dimensions that include their relative usage of heterogeneous processors, like CPUs vs. graphics processors (GPUs), and their use of the interconnect through variations in data sizes and communication rates.
With Nectere, one can create a mix of workloads to study the effects of consolidation, and one can create both single- and multi-node versions of these benchmarks. Results presented in the paper evaluate workload ability or inability to share resources like GPUs or network interconnects, and the effects of such sharing on applications running in consolidated systems.
Similar resources
Predicting Application Resource Requirements in Virtual Environments
Timothy Wood, Ludmila Cherkasova, Kivanc Ozonat, Prashant Shenoy. HP Laboratories, HPL-2008-122. Keywords: virtualization, application resource usage, benchmarking, modeling, automation, performance models, regression-based approach. Next Generation Data Centers (NGDC) are transforming labor-intensive, hard-coded, siloed systems into shar...
Reconfigurable Computing Systems Used To Support Next Generation High Speed Applications
Emerging dimensions of scientific computing have changed the structural requirements of the underlying hardware and software resources. Growing scientific applications demand high-speed computing platforms capable of run-time architectural updates. The conventional computing approaches using application specific integrated circuits and programmable genera...
Benchmarking Methodology for Embedded Scalable Platforms
Embedded scalable platforms (ESP) are a novel generation of platform architectures that yield optimal energy-performance operations while supporting a diversity of embedded application workloads. A companion methodology combines full-system simulation, pre-designed HW/SW interface libraries, high-level synthesis and FPGA prototyping to enable an effective design-space exploration which is drive...
A Case Study - Scaling Legacy Code on Next Generation Platforms
This research note discusses a case study that summarizes the procedure followed in porting a legacy code to scale on next generation platforms. Here, the legacy Laplace mesh smoothing algorithm is used on a hex mesh as a hotspot for the performance study. The case study was conducted on a testbed that has a similar hardware architecture with many integrated cores (MIC) as that of the next gene...
From Streaming Models to FPGA Implementations
Application advances in the signal processing and communications domains are marked by an increasing demand for better performance and faster time to market. This has motivated model-based approaches to design and deploy such applications productively across diverse target platforms. Dataflow models are effective in capturing these applications that are real-time, multi-rate, and streaming in n...